Abstract

The use of recurrent English word combinations, known as lexical bundles, plays a fundamental role in academic
prose (Karabacak & Qin, 2013). Very little research has compared Turkish non-native and native English
writers’ use of lexical bundles in academic prose in terms of their frequency, structures and functions
(Bal, 2010; Karabacak & Qin, 2013; Öztürk, 2014). Therefore, the present study was
conducted to identify the most frequently used lexical bundles in the published academic articles
of Turkish non-native and native speakers of English and to investigate whether there was a significant
difference between native and non-native scholars with respect to the frequency, structures and functions of
English lexical bundles. The data were drawn from two corpora: 15 scientific articles by native
speakers and 15 scientific articles by advanced Turkish writers. The investigation included a quantitative analysis
of the use of three-word lexical bundles and a qualitative analysis of their structures and functions. To
be conservative, three-word sequences that occur 40 times per million words and appear in 5
different texts were treated as lexical bundles in this study. The findings revealed that Turkish
non-native writers showed underuse and less variation in the use of lexical bundles in their academic prose
compared to native speakers.

Keywords: lexical bundles, academic writing, Turkish non-native writers, native English writers 

1. Introduction 

Corpus-based studies have led to wide agreement that lexical bundles are necessary
building blocks of written discourse (Biber & Conrad, 1999; Cortes, 2006; Hyland, 2008a; Li & Schmitt, 2009).
Analyses of academic corpora have demonstrated that lexical bundles are widespread in written registers (Biber
et al., 2004; Biber & Barbieri, 2007). In one study, lexical bundles were found to constitute 52.3% of the written
discourse (Erman & Warren, 2000). The acquisition of these recurrent word combinations is therefore
significant for the development of academic writing skills for at least three reasons. Firstly, lexical bundles are
frequently repeated and form an essential part of the structural material. Secondly, because they are used so often,
lexical bundles are defining markers of successful writing. Finally, these bundles combine grammar and
vocabulary and thereby form the lexicogrammatical underpinnings of a language (Coxhead & Byrd, 2007).

According to some scholars, the frequent use of lexical bundles in academic writing signals a competent
language user, whereas the absence of these bundles is characteristic of novice writers (Haswell, 1991;
Cortes, 2004; Hyland, 2008a; Chen & Baker, 2010). In this respect, Cortes (2004) argues that a certain usage of
lexical bundles is an indication of a competent language user. Similarly, Ellis, Simpson-Vlach and Maynard
(2008) state that the frequent use of lexical bundles results in natural-sounding language.

However, the majority of corpus-based studies have demonstrated that learners’ use of recurrent
multi-word combinations is often problematic (Cortes, 2004; Hyland, 2008b; Li & Schmitt, 2009; Chen & Baker,
2010; Wei & Lei, 2011; Ädel & Erman, 2012). According to this research, although non-native learners can produce
a number of native-like formulaic sequences, their limited repertoire of such sequences causes them to overuse
the ones they know, which makes their writing seem non-native (Li & Schmitt, 2009). Similarly, some studies
have shown that non-native learners overuse or underuse some lexical bundles in their writing and that they use
a more limited and less varied set of lexical bundles (Allen, 2009; Ädel & Erman, 2012). Even advanced non-native
English learners and second language learners have substantial problems acquiring lexical bundles (Bishop, 2004;
Karabacak & Qin, 2013). To the researcher’s knowledge, relatively few corpus-based studies have focused on
Turkish writers’ usage of lexical bundles (Bal, 2010; Karabacak & Qin, 2013; Öztürk,
2014). Therefore, the present study was conducted to identify the most frequently used lexical
bundles in the published academic articles of Turkish non-native and native speakers of English and to
investigate whether there was a significant difference between native and non-native scholars with respect to the
structures and functions of English lexical bundles.

1.1 Literature Review 

The term ‘lexical bundle’ was coined by Biber et al. (1999) in the thirteenth chapter of the Longman
Grammar of Spoken and Written English (LGSWE). Biber et al. (1999, p. 990) describe lexical bundles as
“recurrent expressions, regardless of their idiomaticity, and regardless of their structural status” and as “simply
sequences of word forms that commonly go together in natural discourse”. Cortes (2004, p. 400) gives
a consistent definition of lexical bundles as “extended collocations of three or more words that statistically
co-occur in a register”. Biber & Conrad (1999, p. 183) identify lexical bundles as “the most frequent recurring
lexical sequences ..., which can be regarded as extended collocations: sequences of three or more words that
show a statistical tendency to co-occur.”

Biber et al. (1999) set a minimum frequency cut-off of ten occurrences per million words for a sequence to be
regarded as a lexical bundle, whereas Biber et al. (2004) took a more conservative approach and set a
relatively high cut-off: a sequence must recur forty times per million words to be
considered a lexical bundle. Another prominent characteristic of lexical bundles is that they are
different from idioms. A final distinguishing feature is that lexical bundles usually form
incomplete structural units.

Among corpus-based studies focusing on native and non-native academic writing, Chen and Baker (2010)
compared the usage of lexical bundles in native and non-native speakers’ academic writing in order to find
potential trouble spots in SLA. The learner corpus was made up of writing by L1 Chinese learners of L2
English, whereas the other two corpora were made up of L1 writing by native academicians and native university
students. The findings revealed both significant differences and similarities
between native and non-native academic writing. Native and non-native students’ academic writing made
similar use of lexical bundles; compared with the writing of native academicians, both included more VP-based
bundles and discourse markers, “which appears to be a sign of immature writing” (Chen &
Baker, 2010, p. 44). Moreover, non-native writing underused some lexical bundles that were highly frequent in native
academic writing and overused certain lexical bundles that were rarely used in native writing. Another study
on non-native academic writing was conducted by Wei and Lei (2011), who investigated the use of lexical bundles in
the academic writing of advanced Chinese EFL learners. Their findings demonstrated that
advanced learner writers used many more lexical bundles, and more varied lexical bundles, in their
academic writing than professional writers. By contrast, Ädel and Erman (2012) investigated the use of
English lexical bundles in advanced learner writing by L1 speakers of Swedish and by native speakers
who were undergraduate students of linguistics. The results showed that non-native speakers tended
to use a more limited and less diverse set of lexical bundles than native speakers.

Nevertheless, corpus-based research on Turkish writers’ usage of lexical bundles has been quite
limited. The first study, conducted by Bal (2010), investigated the use of four-word lexical bundles in the
research articles of Turkish writers. The most frequent lexical bundles in TSRAC were ‘on the other hand’, ‘the end of
the’, ‘as well as the’, ‘in the case of’ and ‘one of the most’. The researcher classified these bundles
structurally and functionally. Öztürk (2014) compared the lexical bundles used by Turkish postgraduate
students, native English postgraduate students and native writers in a specific academic discipline with regard to
the structures, functions and frequency of the bundles, using a control corpus. The results showed that Turkish
postgraduate students used lexical bundles more frequently than native students and writers. However, Turkish
postgraduate students overused most of the lexical bundles. Lastly, Karabacak and Qin (2013) compared
the use of lexical bundles in the argumentative papers of three groups of university writers:
Turkish, Chinese and American. Their findings indicated that even advanced English
learners had difficulty acquiring some lexical bundles through simple exposure. As there have been very
few studies of Turkish writers’ usage of lexical bundles (Bal, 2010; Karabacak & Qin, 2013;
Öztürk, 2014), the current study attempts to answer the research questions below:

1) What are the most frequently used three-word lexical bundles in the academically published articles of 
Turkish non-native and native speakers of English? 

2) Are there any significant differences between native and non-native scholars with respect to the structures 
and functions of English language three-word lexical bundles? 

2. Method 

This part explains the features of the research corpora (expert and learner corpora) and how they were compiled.
Then, the structural and functional taxonomies of lexical bundles are described in detail.

2.1 Expert Corpus 

The expert corpus was made up of 15 scientific articles written by native speakers of English in the disciplines of
Theoretical and Applied Linguistics and English Language Teaching (144,451 running words) and published
within an 11-year interval (between 2005 and 2016). The articles were gathered from the following distinguished
journals: Journal of Pragmatics, Lingua, English for Specific Purposes, System, Teaching and Teacher Education,
Learning and Individual Differences, Procedia-Social and Behavioral Sciences and Cognition.

2.2 Learner Corpus 

The learner corpus was made up of 15 scientific articles published by Turkish non-native writers in the
same disciplines of Theoretical and Applied Linguistics and English Language Teaching (124,250 running words)
within the same 11-year interval (between 2005 and 2016). The articles were collected
from the following distinguished journals: Journal of Second Language Writing, Lingua, Journal of Pragmatics,
Journal of English for Academic Purposes, System, Procedia-Social and Behavioral Sciences, Computer
Assisted Language Learning and Teaching and Teacher Education. The criteria used to compile the learner and
control corpora were the particular fields of linguistics and English language teaching and the writers’ native
languages.

Table 1 shows the quantity of running words and scientific articles used in learner and control corpora. 


Table 1. Number of words and articles in learner and control corpora

                                 Learner Corpus    Expert Corpus
Number of words                  124,250           144,451
Number of scientific articles    15                15


After the scientific articles were collected, all tables, references, figures and charts were removed from the texts to
prepare them for analysis. The present study focused on three-word lexical bundles because they
are more frequently used in academic writing than longer lexical bundles. To be conservative, Biber et al.’s
(2004) frequency approach was adopted by the researcher. Three-word sequences which occur 40 times per
million words and appear in 5 different texts were treated as lexical bundles in this study. The AntConc
3.4.4 programme was used to identify lexical bundles. This programme produced a list of the
three-word lexical bundles meeting the cut-off points of at least 40 occurrences in 5 different texts in the corpus.
Furthermore, the learner and expert corpora were compared to find differences in the
structures, frequencies and discourse patterns of the most frequent lexical bundles used in native and
non-native academic writing.
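
The extraction step described above can be illustrated with a minimal Python sketch. This is not AntConc's
implementation; it only shows the general idea of counting three-word sequences and keeping those that meet a
frequency cut-off and a range cut-off. The function name, the simple regular-expression tokenizer and the toy texts
are assumptions made for illustration, and the sketch does not stop sequences at sentence boundaries.

```python
import re
from collections import Counter, defaultdict

def three_word_bundles(texts, min_count=40, min_texts=5):
    """Return {bundle: (tokens, range)} for three-word sequences that meet both cut-offs.

    tokens = total occurrences in the corpus; range = number of texts containing the bundle.
    """
    counts = Counter()
    text_ids = defaultdict(set)
    for i, text in enumerate(texts):
        # Hypothetical tokenizer: lowercase alphabetic words, apostrophes kept.
        words = re.findall(r"[a-z']+", text.lower())
        for j in range(len(words) - 2):
            gram = " ".join(words[j:j + 3])
            counts[gram] += 1
            text_ids[gram].add(i)
    return {
        gram: (n, len(text_ids[gram]))
        for gram, n in counts.items()
        if n >= min_count and len(text_ids[gram]) >= min_texts
    }

# Toy corpus with lowered cut-offs so that the example is easy to follow.
toy_texts = [
    "in order to test the use of bundles",
    "the use of bundles in order to write",
    "we use of course the use of words",
]
result = three_word_bundles(toy_texts, min_count=3, min_texts=3)
print(result)
```

With the toy corpus, only 'the use of' occurs three times across three texts, so it is the only sequence retained.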

2.3 Structural Taxonomy of Lexical Bundles 

Biber et al.’s (1999) structural taxonomy, presented in the ‘Longman Grammar of Spoken and Written English’, was
adopted by the researcher as it was the first and only structural taxonomy developed for lexical bundles (shown
in Table 2).

Table 2. Structural taxonomy of lexical bundles by Biber et al. (1999, pp. 1014-1024)

Structure                                                   Sample Bundles
1. Noun phrase with of-phrase fragment                      the beginning of the, the shape of the
2. Noun phrase with other post-modifier fragments           the way in which, the extent to which
3. Prepositional phrase with embedded of-phrase fragment    as a result of, in the case of
4. Other prepositional phrase (fragment)                    at the same time, on the other hand
5. Anticipatory it + verb / adjective phrase                it is possible to, it should be noted that
6. Passive verb + prepositional phrase fragment             is shown in figure, is based on the
7. Copula be + noun / adjective phrase                      is one of the, is part of the, is due to the
8. (Verb phrase +) that-clause fragment                     has been shown that, that there is no
9. (Verb / adjective +) to-clause fragment                  are likely to be, has been shown to, to be able to
10. Adverbial clause fragment                               as we have seen, if there is a
11. Pronoun / noun phrase + be (+...)                       this is not the, there was no significant
12. Other expressions                                       as well as the, than that of the


2.4 Functional Taxonomy of Lexical Bundles 

Following the structural classification of lexical bundles, Biber et al. (2004) developed a functional classification of
lexical bundles for conversation and academic prose. Lexical bundles were found to serve three primary
functions: stance bundles, discourse organizers and referential bundles. These functions were defined as follows
(Biber et al., 2004, p. 384):

“Stance bundles express attitudes or assessments of certainty that frame some other proposition. Discourse
organizers reflect relationships between prior and coming discourse. Referential bundles make direct reference to
physical or abstract entities, or to the textual context itself, either to identify the entity or to single out some
particular attribute of the entity as especially important.”

In the current study, the lexical bundles were categorized according to these three functions, and
when necessary, concordance lines were checked in order to determine the functions of the lexical bundles.

3. Results 

3.1 Overall Frequencies between the Corpora 

First of all, the overall number of lexical bundles in native and non-native writing was identified. Table 3
shows the overall frequencies of lexical bundles in the published academic articles of Turkish
non-native and native speakers of English.


Table 3. Total frequencies of three-word lexical bundles in each corpus

                             Bundle Types    Bundle Tokens    Corpus Size
Turkish Non-native Corpus    7               523              124,250
English Native Corpus        10              513              144,451


As can be seen in Table 3, Turkish non-native academic writing contained more three-word lexical
bundle tokens (n = 523) than the writing of native professional writers (n = 513). However, Turkish non-native
academic articles showed less varied lexical bundles (7 bundle types) than the academic writing of English native
speakers (10 bundle types). Therefore, it can be concluded that although Turkish non-native writers tend to
use a higher number of the three-word lexical bundles identified in academic writing, they use less varied
lexical bundles than professional writers.

Furthermore, frequencies per million words and per text were also calculated in order to compare the
standardized findings between the corpora.

Table 4. Raw frequencies and frequencies per million words and per text for the most frequent lexical bundles in the
corpora

                  Raw Frequency    Frequency per million words    Frequency per text
Learner Corpus    523              4209                           26.15
Expert Corpus     513              3551                           25.65


As shown in Table 4, Turkish non-native speakers used three-word lexical bundles more often (4209 occurrences
per million words) than English native speakers (3551 occurrences per million words). The results also showed that
native professional writers used these lexical bundles an average of 25.65 times per text in academic writing,
while Turkish non-native writers used them an average of 26.15 times per text in their written discourse.
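
The per-million-word figures follow from dividing each raw frequency by the corpus size and scaling to one million
words. A short Python sketch of that normalization, using the raw frequencies and corpus sizes reported in this
study (the function name is an assumption for illustration):

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw frequency to occurrences per million running words."""
    return raw_count / corpus_size * 1_000_000

learner_rate = per_million(523, 124_250)  # Turkish non-native (learner) corpus
expert_rate = per_million(513, 144_451)   # English native (expert) corpus

print(round(learner_rate), round(expert_rate))  # 4209 3551
```

Normalizing in this way allows the two corpora to be compared even though they differ in size.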

In terms of the three-word lexical bundle type and range, Table 5 demonstrates the most frequent bundles in 
academically written texts of native and non-native writers. 


Table 5. The most frequent three-word lexical bundles in the corpora

Turkish Bundle Type    Bundle Tokens    Bundle Range    English Bundle Type    Bundle Tokens    Bundle Range
the use of             170              13              the use of             81               12
in terms of            82               13              as well as             66               14
in order to            67               12              the fact that          62               8
the present study      58               10              in terms of            49               9
with respect to        57               9               on the other           45               8
of the participants    48               9               use of the             45               9
as well as             41               10              one of the             43               13
                                                        part of the            42               10
                                                        in order to            40               10
                                                        the other hand         40               6
Total                  523                                                     513



According to Table 5, the most frequently used three-word lexical bundle in the Turkish academic
texts was ‘the use of’, which occurred 170 times; it was also the most frequent bundle in the English academic
texts, with a frequency of 81. Comparing the two corpora with regard to the most frequently used lexical
bundles, four bundles were shared by native and non-native writers:
‘the use of’, ‘in terms of’, ‘in order to’ and ‘as well as’. However, except for ‘as well as’, the
tokens of these bundles were much higher in the Turkish non-native texts than in those of the native professional writers.
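
The comparison of the two bundle lists can be reproduced with a few lines of Python. The sketch below is only an
illustration: it hard-codes the bundle tokens reported in Table 5, and the variable names are assumptions.

```python
# Bundle tokens as reported in Table 5.
turkish = {
    "the use of": 170, "in terms of": 82, "in order to": 67,
    "the present study": 58, "with respect to": 57,
    "of the participants": 48, "as well as": 41,
}
english = {
    "the use of": 81, "as well as": 66, "the fact that": 62,
    "in terms of": 49, "on the other": 45, "use of the": 45,
    "one of the": 43, "part of the": 42, "in order to": 40,
    "the other hand": 40,
}

# Bundles that appear in both lists.
shared = sorted(set(turkish) & set(english))

# Shared bundles with more tokens in the Turkish corpus than in the English corpus.
higher_in_turkish = [b for b in shared if turkish[b] > english[b]]

print(shared)
print(higher_in_turkish)
```

The totals of the two dictionaries (523 and 513) match the bundle tokens in Table 3.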

3.2 Structures of Lexical Bundles 

The lexical bundles were categorized structurally depending on the structural taxonomy of Biber et al. (1999).
Table 6 shows the distribution of structures employed by Turkish non-native and English native texts.

Table 6. Structures of three-word lexical bundles in the corpora

Structure                                                   Turkish Non-native Corpus                           English Native Corpus
1. Noun phrase with of-phrase fragment                      the use of                                          the use of, use of the, one of the, part of the
2. Noun phrase with other post-modifier fragments           -                                                   the fact that
3. Prepositional phrase with embedded of-phrase fragment    in terms of                                         in terms of
4. Other prepositional phrase (fragment)                    in order to, with respect to, of the participants   on the other (hand), in order to
5. Anticipatory it + verb / adjective phrase                -                                                   -
6. Passive verb + prepositional phrase fragment             -                                                   -
7. Copula be + noun / adjective phrase                      -                                                   -
8. (Verb phrase +) that-clause fragment                     -                                                   -
9. (Verb / adjective +) to-clause fragment                  -                                                   -
10. Adverbial clause fragment                               -                                                   -
11. Pronoun / noun phrase + be (+...)                       -                                                   -
12. Other expressions                                       as well as                                          as well as

According to Table 6, English native writing included slightly more NP with of-phrase fragments (use of the,
one of the, part of the), while Turkish non-native academic writing tended to use more other PP fragments (with
respect to, in order to, of the participants) than the native texts. Nevertheless, four structural types of lexical
bundles were used by both native and non-native writers: NP with of-phrase fragment (the use of, use of the, one
of the, part of the), PP with embedded of-phrase fragment (in terms of), other PP (in order to, with respect to, of
the participants, on the other (hand)) and other expressions (as well as).

3.3 Functions of the Lexical Bundles 

The three-word lexical bundles were classified functionally based on the functional taxonomy of Biber et al. 
(2004). Figure 1 demonstrates the functional distribution of lexical bundles employed by native and non-native 
texts. 


[Figure: bar chart titled “Functions of the Bundles” comparing Turkish non-native academic writing and English
native academic writing; legend: referential bundles, discourse organizers, stance bundles]


Figure 1. Functions of three-word lexical bundles 


According to Figure 1, English native writers employed slightly more referential bundles than their non-native
counterparts in order to make direct reference to physical or abstract entities, or to the textual context itself.
Native writers also employed the other bundle types (stance bundles and discourse organizers) more than Turkish
non-native writers. For example, the stance bundle “the fact that” was only used by native writers:

• “Particularly in view of the fact that the analyses in this paper are not corpus-based, it is beyond the scope
of this paper to consider in detail ways in which institutional setting, text-type, or other text-categorizations...”

“On the other (hand)” was a discourse organizer bundle employed by English native writers:

• “Biologists, on the other hand, made considerable use of resultative markers, bundles which introduce
writer’s interpretations and understandings of research processes”

As for the referential bundles, “one of the” and “part of the” were employed
in native texts but not in non-native texts:

• “As the analysis shows, one of the primary characteristics, and (perhaps) goals of the quiz game is that
students perform as student-contestants”

• “High pitch level in student answer bids can be considered one part of the physical and discursive practice
of students in the co-construction...”

On the other hand, two of the bundles (with respect to, of the participants) were only used by Turkish non-native
writers:

• “...with a strong tendency to appeal to people’s inner, true selves both
with respect to their emotions and their inner wishes and aspirations”

• “...and clarify why some of the participants attempted categorizations or commented on groupings as
displayed in the excerpts.”

• “The initial comments of the participants indicate the power of the writing teacher and her expectations
from the students.”

4. Discussion 

The present study was conducted to identify the most frequently used lexical bundles in the published
academic articles of Turkish non-native and English native speakers and to investigate whether there was a
significant difference between native and Turkish non-native scholars with respect to the frequency, structures
and functions of English lexical bundles. The data were drawn from two corpora: 15 scientific
articles by native speakers and 15 scientific articles by advanced Turkish writers. The investigation included a
quantitative analysis of the use of three-word lexical bundles and a qualitative analysis of their structures and
functions. To be conservative, three-word sequences which occur 40 times per million
words and appear in 5 different texts were treated as lexical bundles in this study.

The findings demonstrated that although Turkish non-native writers used a higher
number of the three-word lexical bundles identified in academic writing, they used less varied lexical bundles
than English professional writers. Regarding the most frequent three-word lexical bundles in native and
non-native academic writing, the most frequently used bundle in the Turkish academic
texts was ‘the use of’, which occurred about twice as often as in the native writers’ texts. Four of the most frequent
bundles were shared by native and non-native writers: ‘the use of’, ‘in
terms of’, ‘in order to’ and ‘as well as’. As for the structural taxonomy of lexical bundles, English native writing
included slightly more NP with of-phrase fragments (use of the, one of the, part of the), while Turkish non-native
academic writing used more other PP fragments (with respect to, in order to, of the participants) than the native texts.
Lastly, regarding the functions of lexical bundles, Turkish non-native writers used more referential bundles in
their academic writing than other bundle types. However, they used less varied bundles than the
native texts.

The results of the present study are consistent with previous studies showing that non-native learners overuse
or underuse some lexical bundles in their writing and use a more limited and less varied set of lexical bundles
(Allen, 2009; Ädel & Erman, 2012; Li & Schmitt, 2009). Ädel and Erman (2012) investigated
the use of English lexical bundles in advanced learner writing by L1 speakers of Swedish
and by native speakers who were undergraduate students of linguistics. Their findings showed that non-native
speakers tended to use a more limited and less diverse set of lexical bundles than native speakers.
Another study, conducted by Bal (2010), demonstrated that the most frequent four-word lexical bundles in
Turkish academic writing included ‘on the other hand’, ‘as well as the’ and ‘one of the most’, which is consistent
with the current study even though the current study focused on three-word lexical bundles. A further study,
conducted by Öztürk (2014), concluded that Turkish non-native writers used lexical bundles more frequently than
native writers, which is in line with the findings of the current study.

4.1 Pedagogical Implications 

Studies have demonstrated that lexical bundles are not acquired naturally; even simple exposure to lexical
bundles is not enough for learners to use them actively (Cortes, 2004, 2006; Karabacak & Qin,
2013; Wei & Lei, 2011). Even advanced learners have substantial problems with lexical bundles (Bishop, 2004;
Karabacak & Qin, 2013). Therefore, because it entails a deep level of processing, explicit teaching of lexical bundles
might be one of the solutions that language instructors can use to foster learners’ acquisition of lexical
bundles in their writing.

It is also clear that lexical bundles are acquired incrementally, just like single words (Schmitt, 2000; Nation,
2001; Schmitt et al., 2004; Li & Schmitt, 2009; Colovic-Markovic, 2012). Learners therefore need
a large number of repeated exposures to acquire lexical bundles. In this respect, noticing, retrieval and
generative activities such as rephrasing (Peters & Pauwels, 2015), substitution tasks (Salazar, 2014), writing
activities (Nation, 2001) or Google search applications, techniques and tasks (Zengin, 2009; Zengin & Kaçar,
2015) are some of the many means that writing instructors can use in the EFL classroom to enhance learners’
successful acquisition and retention of these multi-word combinations.

Material developers and writing course designers can include multi-word combinations in the
textbooks of writing classes in language programs, with limited or extended contexts drawn from COCA, to enhance
in-depth knowledge of the uses and functions of lexical bundles.

4.2 Limitations 

There are several limitations to the current study. The results need to be treated with some caution
since the corpora cover only one academic discipline and are small in size; the results cannot be generalized to
all disciplines. Further research could be conducted with more disciplines and larger corpora.